The context: Why is this problem important to solve?
The objectives: What is the intended goal?
The key questions: What are the key questions that need to be answered?
The problem formulation: What is it that we are trying to solve using data science?
The dataset contains a total of 24,958 training and 2,600 test images (in colour), derived from microscopic images. The images belong to the following categories:
Parasitized: The parasitized cells contain the Plasmodium parasite which causes malaria
Uninfected: The uninfected cells are free of the Plasmodium parasites
Malaria is a disease affecting tropical countries. It is predominant in several underdeveloped countries, and due to the effects of climate change there is concern that it will spread locally as well. The disease can be cured, but it remains one of the deadliest diseases affecting humans. I was personally infected several times during my childhood in Nigeria, and this project reminds me of my past struggles with the disease.
The intended goal of the project is to diagnose the disease accurately in its early stages and thereby mitigate its severity and spread. This is crucial because the disease always requires proper treatment.
Data science can play a crucial role, especially in poor countries that lack the medical resources and professionals needed to diagnose the disease early and get patients onto the necessary treatment. In developed countries, convolutional neural network models can be trained to detect the parasite efficiently at an early stage, easing the burden on medical professionals during emergencies or shortages of medical resources, as experienced during the recent COVID-19 pandemic.
The key questions are how easily we can collect the data needed to train our model, how consistent and reliable that data is, and whether we can gather a large enough sample for every class we are trying to learn. It is also important to understand any bias in the data and take action to remove it or mitigate its effect.
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
# Library for creating data paths
import os
# Library for randomly selecting data points
import random
# Library for performing numerical computations
import numpy as np
# Library for creating and showing plots
import matplotlib.pyplot as plt
# Library for reading and showing images
import matplotlib.image as mpimg
import seaborn as sns
# Importing all the required sub-modules from Keras
from keras.models import Sequential, Model
from keras.applications.vgg16 import VGG16
from keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import img_to_array, load_img
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, BatchNormalization, Dropout
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
Note: The commented-out code below was used once to extract the dataset archive into the project folder.
#import zipfile
#zip_ref = zipfile.ZipFile("/content/drive/MyDrive/Capstone_Project/cell_images.zip", 'r')
#zip_ref.extractall("/content/drive/MyDrive/Capstone_Project/")
#zip_ref.close()
The extracted folder contains separate train and test folders, each holding images of varying sizes for parasitized and uninfected cells in correspondingly named subfolders.
All images must be resized to the same dimensions and stacked into a 4D array so they can serve as input to the convolutional neural network. We also need to create labels for both image types to be able to train and test the model.
Let's do this for the training data first and then reuse the same code for the test data.
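As a quick illustration of the target format, the stacking step can be sketched with NumPy. The toy arrays and labels below are hypothetical stand-ins for the real images:

```python
import numpy as np

# Hypothetical stand-ins for three images already resized to 64x64 RGB
images = [np.zeros((64, 64, 3), dtype=np.uint8) for _ in range(3)]
labels = [1, 0, 1]  # 1 = parasitized, 0 = uninfected

# Stacking equal-sized images yields the 4D array a CNN expects:
# (num_images, height, width, channels)
X = np.stack(images)
y = np.array(labels)

print(X.shape)  # (3, 64, 64, 3)
print(y.shape)  # (3,)
```

Note that stacking only works once every image has the same height and width, which is why the resizing step matters.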
# Parent directory where images are stored in drive
parent_dir = '/content/drive/MyDrive/Capstone_Project/cell_images'
# Path to the training and testing datasets within the parent directory
train_dir = os.path.join(parent_dir, 'train')
test_dir = os.path.join(parent_dir, 'test')
# Directory with our training pictures
train_parasitized_dir = os.path.join(train_dir, 'parasitized')
train_uninfected_dir = os.path.join(train_dir, 'uninfected')
# Directory with our testing pictures
test_parasitized_dir = os.path.join(test_dir, 'parasitized')
test_uninfected_dir = os.path.join(test_dir, 'uninfected')
train_parasitized_files = len(os.listdir(train_parasitized_dir))
train_uninfected_files = len(os.listdir(train_uninfected_dir))
test_parasitized_files = len(os.listdir(test_parasitized_dir))
test_uninfected_files = len(os.listdir(test_uninfected_dir))
print('Number of training images parasitized:',train_parasitized_files)
print('Number of training images uninfected:',train_uninfected_files)
print('Number of testing images parasitized:',test_parasitized_files)
print('Number of testing images uninfected:',test_uninfected_files)
Number of training images parasitized: 12582
Number of training images uninfected: 12376
Number of testing images parasitized: 1300
Number of testing images uninfected: 1300
from PIL import Image
train_images=[]
train_class=[]
test_images=[]
test_class=[]
def create_dataset(img_folder):
    for dir in os.listdir(img_folder):
        for file in os.listdir(os.path.join(img_folder, dir)):
            image_path = os.path.join(img_folder, dir, file)
            image = Image.open(image_path)
            # Resizing each image to (64, 64)
            # image = image.resize((64, 64))
            image = np.array(image)
            # Normalizing the images
            # image = image.astype('float32')
            # image /= 255
            if (img_folder.find('train') != -1):
                train_images.append(image)
            if (img_folder.find('train') != -1 and dir == 'parasitized'):
                train_class.append(1)
            if (img_folder.find('train') != -1 and dir == 'uninfected'):
                train_class.append(0)
            if (img_folder.find('test') != -1):
                test_images.append(image)
            if (img_folder.find('test') != -1 and dir == 'parasitized'):
                test_class.append(1)
            if (img_folder.find('test') != -1 and dir == 'uninfected'):
                test_class.append(0)
create_dataset(train_dir)
create_dataset(test_dir)
for i in range(0, 5):
    print('Shape of Training image ', i, ' ', train_images[i].shape)
for i in range(0, 5):
    print('Shape of Testing image ', i, ' ', test_images[i].shape)
Shape of Training image 0 (154, 160, 3)
Shape of Training image 1 (148, 154, 3)
Shape of Training image 2 (178, 172, 3)
Shape of Training image 3 (142, 148, 3)
Shape of Training image 4 (163, 175, 3)
Shape of Testing image 0 (121, 112, 3)
Shape of Testing image 1 (151, 157, 3)
Shape of Testing image 2 (121, 148, 3)
Shape of Testing image 3 (118, 118, 3)
Shape of Testing image 4 (118, 100, 3)
train_images = np.array(train_images,dtype=object)
train_class = np.array(train_class)
test_images = np.array(test_images,dtype=object)
test_class = np.array(test_class)
print('Shape of train images:',train_images.shape)
print('Shape of test images:',test_images.shape)
print('Shape of train labels:',train_class.shape)
print('Shape of test labels:',test_class.shape)
Shape of train images: (24958,)
Shape of test images: (2600,)
Shape of train labels: (24958,)
Shape of test labels: (2600,)
There are 24,958 training labels and 2,600 test labels. We observed that the training and test images vary in pixel dimensions, so we will have to resize them to a common size before training the model.
min_pix_red = 255
min_pix_green = 255
min_pix_blue = 255
max_pix_red = 0
max_pix_green = 0
max_pix_blue = 0
for i in range(len(train_images)):
    if (np.min(train_images[i][:, :, 0]) < min_pix_red):
        min_pix_red = np.min(train_images[i][:, :, 0])
    if (np.max(train_images[i][:, :, 0]) > max_pix_red):
        max_pix_red = np.max(train_images[i][:, :, 0])
    if (np.min(train_images[i][:, :, 1]) < min_pix_green):
        min_pix_green = np.min(train_images[i][:, :, 1])
    if (np.max(train_images[i][:, :, 1]) > max_pix_green):
        max_pix_green = np.max(train_images[i][:, :, 1])
    if (np.min(train_images[i][:, :, 2]) < min_pix_blue):
        min_pix_blue = np.min(train_images[i][:, :, 2])
    if (np.max(train_images[i][:, :, 2]) > max_pix_blue):
        max_pix_blue = np.max(train_images[i][:, :, 2])
print('For Training Images:')
print('*************')
print('Min Red Pixels:',min_pix_red)
print('Min Green Pixels:',min_pix_green)
print('Min Blue Pixels:',min_pix_blue)
print('Max Red Pixels:',max_pix_red)
print('Max Green Pixels:',max_pix_green)
print('Max Blue Pixels:',max_pix_blue)
min_pix_red = 255
min_pix_green = 255
min_pix_blue = 255
max_pix_red = 0
max_pix_green = 0
max_pix_blue = 0
for i in range(len(test_images)):
    if (np.min(test_images[i][:, :, 0]) < min_pix_red):
        min_pix_red = np.min(test_images[i][:, :, 0])
    if (np.max(test_images[i][:, :, 0]) > max_pix_red):
        max_pix_red = np.max(test_images[i][:, :, 0])
    if (np.min(test_images[i][:, :, 1]) < min_pix_green):
        min_pix_green = np.min(test_images[i][:, :, 1])
    if (np.max(test_images[i][:, :, 1]) > max_pix_green):
        max_pix_green = np.max(test_images[i][:, :, 1])
    if (np.min(test_images[i][:, :, 2]) < min_pix_blue):
        min_pix_blue = np.min(test_images[i][:, :, 2])
    if (np.max(test_images[i][:, :, 2]) > max_pix_blue):
        max_pix_blue = np.max(test_images[i][:, :, 2])
print('For Testing Images:')
print('*************')
print('Min Red Pixels:',min_pix_red)
print('Min Green Pixels:',min_pix_green)
print('Min Blue Pixels:',min_pix_blue)
print('Max Red Pixels:',max_pix_red)
print('Max Green Pixels:',max_pix_green)
print('Max Blue Pixels:',max_pix_blue)
For Training Images:
*************
Min Red Pixels: 0
Min Green Pixels: 0
Min Blue Pixels: 0
Max Red Pixels: 255
Max Green Pixels: 244
Max Blue Pixels: 246
For Testing Images:
*************
Min Red Pixels: 0
Min Green Pixels: 0
Min Blue Pixels: 0
Max Red Pixels: 255
Max Green Pixels: 231
Max Blue Pixels: 215
For the training images, the red pixel values range from 0 to 255, green from 0 to 244, and blue from 0 to 246.
For the testing images, red ranges from 0 to 255, green from 0 to 231, and blue from 0 to 215.
Analysing the pixel ranges shows that red is the most prominent channel in both the training and testing images.
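Since the raw images differ in size, the loops above inspect them one at a time. The same per-channel minima and maxima can be computed more compactly; this is a sketch using hypothetical stand-in images rather than the actual dataset:

```python
import numpy as np

# Hypothetical variable-sized RGB images standing in for train_images
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8)
          for h, w in [(100, 120), (90, 110)]]

# Per-channel minima/maxima across all images, without the if-chains:
# each img.min(axis=(0, 1)) reduces over height and width, leaving (R, G, B)
mins = np.min([img.min(axis=(0, 1)) for img in images], axis=0)
maxs = np.max([img.max(axis=(0, 1)) for img in images], axis=0)
print('Min per channel (R, G, B):', mins)
print('Max per channel (R, G, B):', maxs)
```

The axis=(0, 1) reduction is what replaces the six separate if-branches per image.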
parasitized_count = 0
uninfected_count = 0
for i in range(len(train_class)):
    if (train_class[i] == 1):
        parasitized_count += 1
    if (train_class[i] == 0):
        uninfected_count += 1
print('Train Parasitized count:',parasitized_count)
print('Train Uninfected count:',uninfected_count)
parasitized_count = 0
uninfected_count = 0
for i in range(len(test_class)):
    if (test_class[i] == 1):
        parasitized_count += 1
    if (test_class[i] == 0):
        uninfected_count += 1
print('Test Parasitized count:',parasitized_count)
print('Test Uninfected count:',uninfected_count)
Train Parasitized count: 12582
Train Uninfected count: 12376
Test Parasitized count: 1300
Test Uninfected count: 1300
from PIL import Image
train_images=[]
train_class=[]
test_images=[]
test_class=[]
def create_dataset(img_folder):
    for dir in os.listdir(img_folder):
        for file in os.listdir(os.path.join(img_folder, dir)):
            image_path = os.path.join(img_folder, dir, file)
            image = Image.open(image_path)
            # Resizing each image to (64, 64)
            image = image.resize((64, 64))
            image = np.array(image)
            # Normalizing the images
            image = image.astype('float32')
            image /= 255
            if (img_folder.find('train') != -1):
                train_images.append(image)
            if (img_folder.find('train') != -1 and dir == 'parasitized'):
                train_class.append(1)
            if (img_folder.find('train') != -1 and dir == 'uninfected'):
                train_class.append(0)
            if (img_folder.find('test') != -1):
                test_images.append(image)
            if (img_folder.find('test') != -1 and dir == 'parasitized'):
                test_class.append(1)
            if (img_folder.find('test') != -1 and dir == 'uninfected'):
                test_class.append(0)
create_dataset(train_dir)
create_dataset(test_dir)
for i in range(0, 5):
    print('Shape of Training image ', i, ' ', train_images[i].shape)
for i in range(0, 5):
    print('Shape of Testing image ', i, ' ', test_images[i].shape)
Shape of Training image 0 (64, 64, 3)
Shape of Training image 1 (64, 64, 3)
Shape of Training image 2 (64, 64, 3)
Shape of Training image 3 (64, 64, 3)
Shape of Training image 4 (64, 64, 3)
Shape of Testing image 0 (64, 64, 3)
Shape of Testing image 1 (64, 64, 3)
Shape of Testing image 2 (64, 64, 3)
Shape of Testing image 3 (64, 64, 3)
Shape of Testing image 4 (64, 64, 3)
# Converting training and testing images and their corresponding labels to numpy arrays.
train_images = np.array(train_images)
train_class = np.array(train_class)
test_images = np.array(test_images)
test_class = np.array(test_class)
print('Shape of train images:',train_images.shape)
print('Shape of test images:',test_images.shape)
print('Shape of train labels:',train_class.shape)
print('Shape of test labels:',test_class.shape)
Shape of train images: (24958, 64, 64, 3)
Shape of test images: (2600, 64, 64, 3)
Shape of train labels: (24958,)
Shape of test labels: (2600,)
We have 24,958 training images across the parasitized and uninfected classes, and 2,600 test images of the same classes for evaluating model performance. We resized the train and test images to 64 x 64, converted them to float32, and normalized them by dividing by 255.
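A quick sanity check of that preprocessing, sketched with a hypothetical uint8 image standing in for a real cell image:

```python
import numpy as np

# Hypothetical uint8 image standing in for one loaded cell image
img = np.tile(np.arange(256, dtype=np.uint8), 48).reshape(64, 64, 3)

# Same preprocessing as in create_dataset: cast to float32, scale to [0, 1]
img = img.astype('float32')
img /= 255

print(img.dtype, img.min(), img.max())
```

After this step every pixel lies in [0, 1], which keeps the gradient magnitudes in a range the optimizer handles well.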
# set width of bar
barWidth = 0.25
fig = plt.figure()
# creating the bar plot
ax = fig.add_axes([0,0,1,1])
X_label = ['train_uninfected', 'train_parasitized', 'test_uninfected', 'test_parasitized']
y_label = []
uninfected_cnt = 0
parasitized_cnt = 0
test_uninfected_cnt = 0
test_parasitized_cnt = 0
for i in range(len(train_class)):
    if (train_class[i] == 0):
        uninfected_cnt += 1
for i in range(len(train_class)):
    if (train_class[i] == 1):
        parasitized_cnt += 1
for i in range(len(test_class)):
    if (test_class[i] == 0):
        test_uninfected_cnt += 1
for i in range(len(test_class)):
    if (test_class[i] == 1):
        test_parasitized_cnt += 1
y_label.append(uninfected_cnt)
y_label.append(parasitized_cnt)
y_label.append(test_uninfected_cnt)
y_label.append(test_parasitized_cnt)
# Make the plot
print(y_label)
#ax.bar(X_label,y_label)
ax.bar(X_label[0],y_label[0], color = 'g', width = 0.25)
ax.bar(X_label[1],y_label[1], color = 'r', width = 0.25)
ax.bar(X_label[2],y_label[2], color = 'g', width = 0.25)
ax.bar(X_label[3],y_label[3], color = 'r', width = 0.25)
plt.xlabel("Malaria Detection")
plt.ylabel("Count")
plt.title("Malaria Detection Training vs Testing Counts")
plt.show()
[12376, 12582, 1300, 1300]
We have 12,582 parasitized and 12,376 uninfected training images, while the test set has 1,300 images of each class. The training data contains 206 more parasitized images than uninfected ones, so it is slightly imbalanced; the test data is balanced. A balanced training set is preferable, but a difference of 206 images is unlikely to cause a significant drop in accuracy. I'll train the model, evaluate the accuracy and f1-score, and if there is enough variation between precision and recall, try balancing the data and compare.
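The per-class counts used in this comparison can also be obtained in a single call with np.unique; a sketch with a hypothetical label array in place of train_class:

```python
import numpy as np

# Hypothetical label array standing in for train_class
labels = np.array([1, 0, 1, 1, 0])

# Sorted class values and their counts in one call
classes, counts = np.unique(labels, return_counts=True)
print(dict(zip(classes.tolist(), counts.tolist())))  # {0: 2, 1: 3}
```

This replaces the counting loops with a single vectorized pass.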
Let's visualize the images from the train data
fig = plt.figure(figsize=(10, 10))
for i in range(7):
    ax = plt.subplot(1, 7, i+1)
    ax.axis('Off')
    plt.imshow(train_images[i], interpolation='nearest')
The infected cell images have a patch of dark pink colour that marks the infection. The uninfected cells lack this patch, which is evident in the 36-image subplot below showing both types.
fig = plt.figure(figsize=(12, 12))
data = []
classes = []
for i in range(12570, 12606):
    data.append(train_images[i])
    classes.append(train_class[i])
for i in range(len(data)):
    ax = plt.subplot(6, 6, i+1)
    ax.axis('Off')
    plt.imshow(data[i], interpolation='nearest')
    if (classes[i] == 1):
        plt.title('parasitized')
    if (classes[i] == 0):
        plt.title('uninfected')
plt.plot(6, 6)
parasitized_images = []
uninfected_images = []
mean_inf_img = []
mean_uninf_img = []
def create_ds():
    for i in range(len(train_class)):
        if train_class[i] == 1:
            parasitized_images.append(train_images[i])
    for i in range(len(train_class)):
        if train_class[i] == 0:
            uninfected_images.append(train_images[i])
create_ds()
# Function to plot the mean image
def calc_mean_image(label):
    if label == 1.0:
        mean_inf_img = np.mean(parasitized_images, axis=0)
        plt.imshow(mean_inf_img)
        plt.title('Mean Image: Parasitized')
    if label == 0.0:
        mean_uninf_img = np.mean(uninfected_images, axis=0)
        plt.imshow(mean_uninf_img)
        plt.title('Mean Image: Uninfected')
Mean image for parasitized
fig = plt.figure(figsize=(2, 2))
calc_mean_image(1.0)
Mean image for uninfected
fig = plt.figure(figsize=(2, 2))
calc_mean_image(0.0)
The mean images for the parasitized and uninfected classes are plotted above; the parasitized mean shows a slightly higher colour intensity than the uninfected mean.
import cv2
# Function rgb2hsv to convert images from RGB to HSV
def rgb2hsv(images, classes):
    fig = plt.figure(figsize=(12, 12))
    hsv_array = []
    class_array = []
    for i in range(len(images)):
        # Images were loaded with PIL, so they are in RGB order (not BGR)
        hsv = cv2.cvtColor(images[i], cv2.COLOR_RGB2HSV)
        hsv_array.append(hsv)
        class_array.append(classes[i])
    for i in range(36):
        ax = plt.subplot(6, 6, i+1)
        ax.axis('Off')
        plt.imshow((hsv_array[i] * 255).astype(np.uint8))
        if (class_array[i] == 1):
            plt.title('parasitized')
        if (class_array[i] == 0):
            plt.title('uninfected')
    plt.plot(6, 6)
    return hsv_array, class_array
sample_train_images = []
sample_train_class = []
for i in range(12570, 12606):
    sample_train_images.append(train_images[i])
    sample_train_class.append(train_class[i])
hsv_train_images, hsv_train_class = rgb2hsv(sample_train_images, sample_train_class)
sample_test_images = []
sample_test_class = []
for i in range(1290, 1326):
    sample_test_images.append(test_images[i])
    sample_test_class.append(test_class[i])
hsv_test_images, hsv_test_class = rgb2hsv(sample_test_images, sample_test_class)
In the HSV version of the training images, the infections stand out clearly as lighter, whitish patches; the uninfected images do not have this patch. However, converting to HSV appears to add noise to the images and may not help the accuracy of our models.
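To make the HSV observation concrete: HSV separates hue from saturation and value, so colours of similar hue but different intensity become easy to tell apart. A minimal illustration using Python's standard-library colorsys module (not the cv2 call used above); the two colours are hypothetical:

```python
import colorsys

# Two hypothetical colours as normalized RGB triples
red = (1.0, 0.0, 0.0)   # saturated red
pink = (1.0, 0.6, 0.7)  # paler pink, like an infection patch

h1, s1, v1 = colorsys.rgb_to_hsv(*red)
h2, s2, v2 = colorsys.rgb_to_hsv(*pink)

# Same brightness (V), but the saturation channel separates the colours
print('red  -> H:%.2f S:%.2f V:%.2f' % (h1, s1, v1))
print('pink -> H:%.2f S:%.2f V:%.2f' % (h2, s2, v2))
```

In the stained cell images the same effect makes the infection patches pop out as low-saturation, bright regions.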
# Function to apply gaussian blur on images
def gaussian_blur(images, classes):
    fig = plt.figure(figsize=(12, 12))
    gbr_array = []
    class_array = []
    for i in range(len(images)):
        gbr = cv2.GaussianBlur(images[i], (5, 5), 0)
        gbr_array.append(gbr)
        class_array.append(classes[i])
    for i in range(36):
        ax = plt.subplot(6, 6, i+1)
        ax.axis('Off')
        plt.imshow(gbr_array[i])
        if (class_array[i] == 1):
            plt.title('parasitized')
        if (class_array[i] == 0):
            plt.title('uninfected')
    plt.plot(6, 6)
    return gbr_array, class_array
sample_train_images_gbr = []
sample_train_class_gbr = []
for i in range(12570, 12606):
    sample_train_images_gbr.append(train_images[i])
    sample_train_class_gbr.append(train_class[i])
gbr_train_images, gbr_train_class = gaussian_blur(sample_train_images_gbr, sample_train_class_gbr)
sample_test_images_gbr = []
sample_test_class_gbr = []
for i in range(1290, 1326):
    sample_test_images_gbr.append(test_images[i])
    sample_test_class_gbr.append(test_class[i])
gbr_test_images, gbr_test_class = gaussian_blur(sample_test_images_gbr, sample_test_class_gbr)
Think About It: Would blurring help us for this problem statement in any way? What else can we try?
Gaussian blurring is effective at smoothing an image, reducing noise and fine detail; it is like viewing the image through a translucent screen. It also reduces the standard deviation of the pixel values in the image.
I'll first evaluate model performance without blurring, and later analyse the impact of retraining the best CNN model on the blurred images.
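The claim that blurring reduces the standard deviation of pixel values can be illustrated with a simple NumPy box filter, a crude stand-in for the Gaussian kernel used by cv2.GaussianBlur; the noisy image here is synthetic:

```python
import numpy as np

# Synthetic noisy grayscale image
rng = np.random.default_rng(42)
img = rng.random((64, 64)).astype('float32')

# 3x3 box blur: average the image with its eight shifted copies
# (edges wrap around, which is fine for this illustration)
shifts = [np.roll(np.roll(img, dy, axis=0), dx, axis=1)
          for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
blurred = np.mean(shifts, axis=0)

print('std before blur:', img.std())
print('std after blur: ', blurred.std())
```

Averaging neighbouring pixels pulls each value toward the local mean, which is exactly why the spread of pixel values shrinks.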
Note: The Base Model has been fully built and evaluated with all outputs shown to give an idea about the process of the creation and evaluation of the performance of a CNN architecture. A similar process can be followed in iterating to build better-performing CNN architectures.
# For Model Building
import tensorflow as tf
import keras
from tensorflow.keras.models import Sequential, Model # Sequential api for sequential model
# Clearing backend
from tensorflow.keras import backend
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from random import shuffle
# For this classification problem, the outputs are already binary labels, i.e., either 0 or 1.
# The first model will use sigmoid activation, without any encoding of the target variables.
train_class = train_class.astype('float32')
test_class = test_class.astype('float32')
train_class = np.array(train_class)
test_class = np.array(test_class)
print(train_class.dtype)
print(test_class.dtype)
print(train_class[0])
print(test_class[0])
float32
float32
1.0
1.0
backend.clear_session()
# Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(42)
import random
random.seed(42)
tf.random.set_seed(42)
cnn_model_1 = Sequential()
cnn_model_1.add(Conv2D(64, (2,2), activation='relu', input_shape=(64, 64, 3), padding = 'same'))
cnn_model_1.add(MaxPooling2D(2,2))
cnn_model_1.add(Conv2D(32, (2,2), activation='relu', padding = 'same'))
cnn_model_1.add(MaxPooling2D(2,2))
cnn_model_1.add(Conv2D(32, (2,2), activation='relu', padding = 'same'))
cnn_model_1.add(MaxPooling2D(2,2))
cnn_model_1.add(BatchNormalization())
cnn_model_1.add(Flatten())
cnn_model_1.add(Dense(64, activation='relu'))
cnn_model_1.add(Dense(32, activation='relu'))
cnn_model_1.add(Dense(32, activation='relu'))
cnn_model_1.add(Dense(1, activation='sigmoid'))
cnn_model_1.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D)                          (None, 64, 64, 64)   832
max_pooling2d (MaxPooling2D)             (None, 32, 32, 64)   0
conv2d_1 (Conv2D)                        (None, 32, 32, 32)   8224
max_pooling2d_1 (MaxPooling2D)           (None, 16, 16, 32)   0
conv2d_2 (Conv2D)                        (None, 16, 16, 32)   4128
max_pooling2d_2 (MaxPooling2D)           (None, 8, 8, 32)     0
batch_normalization (BatchNormalization) (None, 8, 8, 32)     128
flatten (Flatten)                        (None, 2048)         0
dense (Dense)                            (None, 64)           131136
dense_1 (Dense)                          (None, 32)           2080
dense_2 (Dense)                          (None, 32)           1056
dense_3 (Dense)                          (None, 1)            33
=================================================================
Total params: 147,617
Trainable params: 147,553
Non-trainable params: 64
_________________________________________________________________
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
cnn_model_1.compile(loss = 'binary_crossentropy',optimizer=opt,metrics=['accuracy'])
Using Callbacks
my_callbacks = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=4)
# This callback will stop the training when there is no improvement in
# the loss for four consecutive epochs.
Fit and train our Model
history1 = cnn_model_1.fit(train_images,train_class, batch_size = 32, shuffle=True, validation_split = 0.2, epochs = 40, verbose = 1, callbacks=my_callbacks)
Epoch 1/40
624/624 [==============================] - 18s 9ms/step - loss: 0.3383 - accuracy: 0.8366 - val_loss: 0.4845 - val_accuracy: 0.8239
Epoch 2/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0889 - accuracy: 0.9692 - val_loss: 0.2722 - val_accuracy: 0.9215
Epoch 3/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0687 - accuracy: 0.9769 - val_loss: 0.7901 - val_accuracy: 0.6536
Epoch 4/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0614 - accuracy: 0.9786 - val_loss: 0.0922 - val_accuracy: 0.9661
Epoch 5/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0566 - accuracy: 0.9804 - val_loss: 0.1202 - val_accuracy: 0.9567
Epoch 6/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0481 - accuracy: 0.9829 - val_loss: 0.0623 - val_accuracy: 0.9792
Epoch 7/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0423 - accuracy: 0.9858 - val_loss: 0.1093 - val_accuracy: 0.9551
Epoch 8/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0400 - accuracy: 0.9857 - val_loss: 0.1057 - val_accuracy: 0.9609
Epoch 9/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0367 - accuracy: 0.9868 - val_loss: 0.3592 - val_accuracy: 0.8750
Epoch 10/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0299 - accuracy: 0.9893 - val_loss: 0.1454 - val_accuracy: 0.9479
Epoch 11/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0261 - accuracy: 0.9906 - val_loss: 0.3768 - val_accuracy: 0.8904
Epoch 12/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0229 - accuracy: 0.9921 - val_loss: 0.4394 - val_accuracy: 0.8772
Epoch 13/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0196 - accuracy: 0.9926 - val_loss: 0.1333 - val_accuracy: 0.9643
Epoch 14/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0180 - accuracy: 0.9938 - val_loss: 0.1523 - val_accuracy: 0.9581
Epoch 15/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0180 - accuracy: 0.9937 - val_loss: 0.1979 - val_accuracy: 0.9449
Epoch 16/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0162 - accuracy: 0.9943 - val_loss: 0.0843 - val_accuracy: 0.9814
Epoch 17/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0133 - accuracy: 0.9949 - val_loss: 0.2650 - val_accuracy: 0.9351
Epoch 18/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0135 - accuracy: 0.9953 - val_loss: 0.1919 - val_accuracy: 0.9579
Epoch 19/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0121 - accuracy: 0.9958 - val_loss: 0.1774 - val_accuracy: 0.9579
Epoch 20/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0125 - accuracy: 0.9964 - val_loss: 0.2603 - val_accuracy: 0.9331
Epoch 21/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0094 - accuracy: 0.9967 - val_loss: 0.3920 - val_accuracy: 0.9069
Epoch 22/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0080 - accuracy: 0.9973 - val_loss: 0.2291 - val_accuracy: 0.9507
Epoch 23/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0099 - accuracy: 0.9965 - val_loss: 0.1714 - val_accuracy: 0.9601
Epoch 24/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0078 - accuracy: 0.9976 - val_loss: 0.3693 - val_accuracy: 0.9125
Epoch 25/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0055 - accuracy: 0.9986 - val_loss: 0.1787 - val_accuracy: 0.9643
Epoch 26/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0082 - accuracy: 0.9974 - val_loss: 0.2046 - val_accuracy: 0.9575
Epoch 27/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0108 - accuracy: 0.9964 - val_loss: 0.2560 - val_accuracy: 0.9367
Epoch 28/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0051 - accuracy: 0.9984 - val_loss: 0.4412 - val_accuracy: 0.8942
Epoch 29/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0067 - accuracy: 0.9979 - val_loss: 0.2062 - val_accuracy: 0.9543
Epoch 30/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0063 - accuracy: 0.9981 - val_loss: 0.3642 - val_accuracy: 0.9341
Epoch 31/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0077 - accuracy: 0.9974 - val_loss: 0.5273 - val_accuracy: 0.9016
Epoch 32/40
624/624 [==============================] - 5s 8ms/step - loss: 0.0050 - accuracy: 0.9984 - val_loss: 0.1742 - val_accuracy: 0.9688
Epoch 33/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0080 - accuracy: 0.9972 - val_loss: 0.1291 - val_accuracy: 0.9730
Epoch 34/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0067 - accuracy: 0.9981 - val_loss: 0.4720 - val_accuracy: 0.8964
Epoch 35/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0033 - accuracy: 0.9989 - val_loss: 0.2745 - val_accuracy: 0.9557
Epoch 36/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0079 - accuracy: 0.9975 - val_loss: 0.2676 - val_accuracy: 0.9429
Epoch 37/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0035 - accuracy: 0.9988 - val_loss: 0.1748 - val_accuracy: 0.9708
Epoch 38/40
624/624 [==============================] - 5s 7ms/step - loss: 0.0015 - accuracy: 0.9996 - val_loss: 0.2634 - val_accuracy: 0.9549
Epoch 39/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0069 - accuracy: 0.9975 - val_loss: 0.2282 - val_accuracy: 0.9637
Epoch 40/40
624/624 [==============================] - 4s 7ms/step - loss: 0.0073 - accuracy: 0.9981 - val_loss: 0.2840 - val_accuracy: 0.9469
accuracy = cnn_model_1.evaluate(test_images,test_class)
print(accuracy)
print('Accuracy of model on Test Data: ','{:.2f}%'.format(accuracy[1]*100))
82/82 [==============================] - 1s 5ms/step - loss: 0.1526 - accuracy: 0.9777
[0.15260440111160278, 0.9776923060417175]
Accuracy of model on Test Data: 97.77%
Plotting the confusion matrix
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
# Function to get class predictions from the sigmoid output, using a
# threshold of 0.7: outputs > 0.7 are classified as parasitized (1.0),
# everything else as uninfected (0.0)
def prediction(images):
    pred_class = []
    predictions = cnn_model_1.predict(images)
    for i, predicted in enumerate(predictions):
        if predictions[i] > 0.7:
            pred_class.append(1.0)
        else:
            pred_class.append(0.0)
    return pred_class
pred_class = prediction(test_images)
print('Prediction first test image:',pred_class[0])
print('Actual Class of first image:',test_class[0])
# Printing the classification report
print(classification_report(test_class, pred_class))
# Plotting the heatmap using confusion matrix
cm = confusion_matrix(test_class, pred_class)
plt.figure(figsize = (8, 5))
sns.heatmap(cm, annot = True, fmt = '.0f', xticklabels = ['Uninfected', 'Parasitized'], yticklabels = ['Uninfected', 'Parasitized'])
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()
82/82 [==============================] - 0s 2ms/step
Prediction first test image: 1.0
Actual Class of first image: 1.0
precision recall f1-score support
0.0 0.99 0.97 0.98 1300
1.0 0.97 0.99 0.98 1300
accuracy 0.98 2600
macro avg 0.98 0.98 0.98 2600
weighted avg 0.98 0.98 0.98 2600
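The precision, recall, and f1-score in the report above follow directly from the confusion-matrix counts. A toy check with hypothetical counts (not the ones from this model):

```python
# Hypothetical confusion-matrix counts for the positive (parasitized) class
tp, fp, fn = 90, 10, 5

precision = tp / (tp + fp)  # of cells predicted parasitized, how many really were
recall = tp / (tp + fn)     # of truly parasitized cells, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(round(precision, 2), round(recall, 2), round(f1, 2))
```

For this diagnostic task, recall on the parasitized class matters most, since a missed infection is costlier than a false alarm.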
Plotting the train and validation curves
# Plotting train and validation accuracy
plt.plot(history1.history['accuracy'])
plt.plot(history1.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
Now let's build another model with a few additional layers and check whether we can improve on the first one, adding layers where useful and altering the activation functions.
backend.clear_session()
# Fixing the seed for random number generators so that we can ensure we receive the same output everytime
np.random.seed(42)
import random
random.seed(42)
tf.random.set_seed(42)
# For the second CNN model,I want to one hot encode the training and testing labels
train_class_encoded = to_categorical(train_class,2)
test_class_encoded = to_categorical(test_class,2)
# For the second CNN model,I've added Dropout layers to prevent the model from overfitting the training data
# The models output layer uses softmax activation instead of sigmoid
cnn_model_2 = Sequential()
cnn_model_2.add(Conv2D(64, (2,2), activation='relu', input_shape=(64, 64, 3), padding = 'same'))
cnn_model_2.add(MaxPooling2D(2,2))
cnn_model_2.add(Conv2D(32, (2,2), activation='relu', padding = 'same'))
cnn_model_2.add(MaxPooling2D(2,2))
cnn_model_2.add(Conv2D(32, (2,2), activation='relu', padding = 'same'))
cnn_model_2.add(MaxPooling2D(2,2))
cnn_model_2.add(Conv2D(32, (2,2), activation='relu', padding = 'same'))
cnn_model_2.add(MaxPooling2D(2,2))
cnn_model_2.add(Dropout(0.2))
cnn_model_2.add(Flatten())
cnn_model_2.add(Dense(64, activation='relu'))
cnn_model_2.add(Dropout(0.2))
cnn_model_2.add(Dense(32, activation='relu'))
cnn_model_2.add(Dense(32, activation='relu'))
cnn_model_2.add(Dense(2, activation='softmax'))
cnn_model_2.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                   Output Shape              Param #
=================================================================
 conv2d (Conv2D)                (None, 64, 64, 64)        832
 max_pooling2d (MaxPooling2D)   (None, 32, 32, 64)        0
 conv2d_1 (Conv2D)              (None, 32, 32, 32)        8224
 max_pooling2d_1 (MaxPooling2D) (None, 16, 16, 32)        0
 conv2d_2 (Conv2D)              (None, 16, 16, 32)        4128
 max_pooling2d_2 (MaxPooling2D) (None, 8, 8, 32)          0
 conv2d_3 (Conv2D)              (None, 8, 8, 32)          4128
 max_pooling2d_3 (MaxPooling2D) (None, 4, 4, 32)          0
 dropout (Dropout)              (None, 4, 4, 32)          0
 flatten (Flatten)              (None, 512)               0
 dense (Dense)                  (None, 64)                32832
 dropout_1 (Dropout)            (None, 64)                0
 dense_1 (Dense)                (None, 32)                2080
 dense_2 (Dense)                (None, 32)                1056
 dense_3 (Dense)                (None, 2)                 66
=================================================================
Total params: 53,346
Trainable params: 53,346
Non-trainable params: 0
_________________________________________________________________
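The softmax output layer declared above maps the two raw outputs (logits) to class probabilities that sum to one, which is why the labels were one-hot encoded and the loss will be categorical cross-entropy. A minimal NumPy sketch (the logit values here are made up for illustration):

```python
import numpy as np

def softmax(z):
    # Subtract the max logit for numerical stability before exponentiating
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Two hypothetical logits from the 2-unit output layer
probs = softmax(np.array([2.0, 0.5]))
print(probs)        # class probabilities; the larger logit wins
print(probs.sum())  # sums to 1 (up to floating-point rounding)
```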
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
cnn_model_2.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
Using Callbacks
my_callbacks = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=10)
# This callback stops training when the loss has not improved
# for ten consecutive epochs.
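As a rough plain-Python illustration (not TensorFlow's actual implementation), the patience logic of this callback behaves like so: training stops once the monitored quantity has failed to improve for `patience` consecutive epochs.

```python
def stop_epoch(losses, patience=10):
    """Sketch of EarlyStopping(monitor='loss', patience=10): return the
    1-based epoch at which training stops, or None if the loss keeps
    improving often enough for training to run to completion."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best, wait = loss, 0   # improvement resets the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Loss improves twice, then stalls: stops `patience` epochs after the last improvement
print(stop_epoch([0.5, 0.4] + [0.4] * 12, patience=10))  # 12
```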
Fit and Train the model
history2 = cnn_model_2.fit(train_images,train_class_encoded, batch_size = 64, shuffle=True, validation_split = 0.2, epochs = 100, verbose = 1, callbacks=my_callbacks)
Epoch 1/100  312/312 [==============================] - 9s 17ms/step - loss: 0.3686 - accuracy: 0.8160 - val_loss: 0.1479 - val_accuracy: 0.9629
...
Epoch 81/100 312/312 [==============================] - 4s 14ms/step - loss: 0.0086 - accuracy: 0.9966 - val_loss: 0.3241 - val_accuracy: 0.9435
(intermediate epochs omitted: training accuracy climbed steadily from 0.8160 to ~0.997 while validation accuracy fluctuated between roughly 0.90 and 0.98; training stopped after 81 epochs via the EarlyStopping callback)
# Evaluate the 2nd model's performance on the test data.
accuracy_model_2 = cnn_model_2.evaluate(test_images,test_class_encoded)
print(accuracy_model_2)
print('Accuracy of model # 2 on Test Data: ','{:.2f}%'.format(accuracy_model_2[1]*100))
82/82 [==============================] - 1s 4ms/step - loss: 0.1078 - accuracy: 0.9850
[0.10781548172235489, 0.9850000143051147]
Accuracy of model # 2 on Test Data:  98.50%
Plotting the confusion matrix
pred_class = cnn_model_2.predict(test_images)
pred_class = np.argmax(pred_class,axis=1)
true_class = np.argmax(test_class_encoded,axis=1)
print(classification_report(true_class, pred_class))
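The `np.argmax(..., axis=1)` calls above collapse each row of softmax probabilities (and each one-hot label row) back to an integer class index; a small sketch with made-up probabilities:

```python
import numpy as np

# Hypothetical softmax outputs for three test images (each row sums to 1)
probs = np.array([[0.90, 0.10],    # confidently uninfected (class 0)
                  [0.20, 0.80],    # confidently parasitized (class 1)
                  [0.45, 0.55]])   # borderline, still class 1

# argmax along axis=1 picks the column with the highest probability per row
print(np.argmax(probs, axis=1))  # [0 1 1]
```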
# Plotting the heatmap using confusion matrix
cm = confusion_matrix(true_class, pred_class)
plt.figure(figsize = (8, 5))
sns.heatmap(cm, annot = True, fmt = '.0f', xticklabels = ['Uninfected', 'Parasitized'], yticklabels = ['Uninfected', 'Parasitized'])
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()
82/82 [==============================] - 0s 2ms/step
              precision    recall  f1-score   support

           0       0.99      0.98      0.98      1300
           1       0.98      0.99      0.99      1300

    accuracy                           0.98      2600
   macro avg       0.99      0.99      0.98      2600
weighted avg       0.99      0.98      0.98      2600
Analyse misclassified Parasitized predictions
Before going further, I want to analyse the misclassified parasitized and uninfected images. Let's look at the data to understand what the model is failing to learn.
Parasitized_error = []
Uninfected_error = []
for i in range(len(true_class)):
    if true_class[i] == 1 and true_class[i] != pred_class[i]:
        Parasitized_error.append(test_images[i])
    if true_class[i] == 0 and true_class[i] != pred_class[i]:
        Uninfected_error.append(test_images[i])
fig = plt.figure(figsize=(10, 10))
for i in range(len(Parasitized_error)):
    ax = plt.subplot(3, 5, i+1)
    ax.axis('Off')
    plt.title('Parasitized_error')
    plt.imshow(Parasitized_error[i])
print('Parasitized predicted as Uninfected count:', len(Parasitized_error))
Parasitized predicted as Uninfected count: 15
Analyse misclassified Uninfected images
# Analyse Uninfected predicted as parasitized
fig = plt.figure(figsize=(12, 10))
print('Uninfected predicted as parasitized count:', len(Uninfected_error))
for i in range(len(Uninfected_error)):
    ax = plt.subplot(6, 4, i+1)
    ax.axis('Off')
    plt.title('Uninf_error')
    plt.imshow(Uninfected_error[i])
Uninfected predicted as parasitized count: 24
**Observations from the misclassifications**
For the misclassified parasitized images, the parasite is barely visible or is hiding at the edge of the cell. We need to capture more such images to improve the learning process.
For the misclassified uninfected images, the cells do appear to show parasite-like marks near the edges, so we should verify that these cells are truly uninfected. Are the labels correct, or were there mistakes in the sampling process? If the labels are correct, we need more information to classify these images reliably.
Plotting the train and the validation curves
# Plot train and validation accuracy
plt.plot(history2.history['accuracy'])
plt.plot(history2.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.show()
Now let's build a model with LeakyReLU as the activation function
Let us build a model using BatchNormalization and LeakyReLU activations.
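For reference, LeakyReLU differs from standard ReLU by letting a small fraction (`alpha`) of each negative input pass through instead of zeroing it, which helps avoid "dead" units. A minimal NumPy sketch (an illustration, not Keras' implementation):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Pass positive values through unchanged; scale negatives by alpha
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, 0.0, 3.0])
print(leaky_relu(x, alpha=0.01))  # negatives shrink to alpha * x
print(leaky_relu(x, alpha=0.1))   # a larger alpha leaks more
```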
backend.clear_session()
# Fix the seeds for the random number generators so that we get reproducible output every time
np.random.seed(42)
import random
random.seed(42)
tf.random.set_seed(42)
from keras.layers import LeakyReLU
cnn_model_3 = Sequential()
cnn_model_3.add(Conv2D(32, (2,2), input_shape=(64, 64, 3), activation=LeakyReLU(alpha=0.01), padding = 'same'))
cnn_model_3.add(MaxPooling2D(2,2))
cnn_model_3.add(Dropout(0.2))
cnn_model_3.add(Conv2D(32, (2,2), activation=LeakyReLU(alpha=0.01), padding = 'same'))
cnn_model_3.add(MaxPooling2D(2,2))
cnn_model_3.add(Conv2D(32, (2,2), activation=LeakyReLU(alpha=0.1), padding = 'same'))
cnn_model_3.add(MaxPooling2D(2,2))
cnn_model_3.add(Conv2D(32, (2,2), activation=LeakyReLU(alpha=0.1), padding = 'same'))
cnn_model_3.add(MaxPooling2D(2,2))
cnn_model_3.add(Dropout(0.2))
cnn_model_3.add(BatchNormalization())
cnn_model_3.add(Flatten())
cnn_model_3.add(Dense(128, activation=LeakyReLU(alpha=0.01)))
cnn_model_3.add(Dropout(0.2))
cnn_model_3.add(Dense(64, activation=LeakyReLU(alpha=0.01)))
cnn_model_3.add(Dropout(0.2))
cnn_model_3.add(Dense(64, activation=LeakyReLU(alpha=0.01)))
cnn_model_3.add(Dense(32, activation=LeakyReLU(alpha=0.01)))
cnn_model_3.add(Dense(2, activation='softmax'))
cnn_model_3.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                   Output Shape              Param #
=================================================================
 conv2d (Conv2D)                (None, 64, 64, 32)        416
 max_pooling2d (MaxPooling2D)   (None, 32, 32, 32)        0
 dropout (Dropout)              (None, 32, 32, 32)        0
 conv2d_1 (Conv2D)              (None, 32, 32, 32)        4128
 max_pooling2d_1 (MaxPooling2D) (None, 16, 16, 32)        0
 conv2d_2 (Conv2D)              (None, 16, 16, 32)        4128
 max_pooling2d_2 (MaxPooling2D) (None, 8, 8, 32)          0
 conv2d_3 (Conv2D)              (None, 8, 8, 32)          4128
 max_pooling2d_3 (MaxPooling2D) (None, 4, 4, 32)          0
 dropout_1 (Dropout)            (None, 4, 4, 32)          0
 batch_normalization (BatchNormalization) (None, 4, 4, 32)  128
 flatten (Flatten)              (None, 512)               0
 dense (Dense)                  (None, 128)               65664
 dropout_2 (Dropout)            (None, 128)               0
 dense_1 (Dense)                (None, 64)                8256
 dropout_3 (Dropout)            (None, 64)                0
 dense_2 (Dense)                (None, 64)                4160
 dense_3 (Dense)                (None, 32)                2080
 dense_4 (Dense)                (None, 2)                 66
=================================================================
Total params: 93,154
Trainable params: 93,090
Non-trainable params: 64
_________________________________________________________________
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
cnn_model_3.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
Using callbacks
my_callbacks = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=10)
Fit and train the model
history3 = cnn_model_3.fit(train_images,train_class_encoded, batch_size = 32, shuffle=True, validation_split = 0.2, epochs = 100, verbose = 1, callbacks=my_callbacks)
Epoch 1/100   624/624 [==============================] - 9s 9ms/step - loss: 0.3417 - accuracy: 0.8293 - val_loss: 0.1094 - val_accuracy: 0.9497
...
Epoch 100/100 624/624 [==============================] - 5s 8ms/step - loss: 0.0129 - accuracy: 0.9958 - val_loss: 0.2101 - val_accuracy: 0.9688
(intermediate epochs omitted: training accuracy climbed from 0.8293 to ~0.996 while validation accuracy fluctuated between roughly 0.91 and 0.99; training ran the full 100 epochs without triggering the EarlyStopping callback)
Plotting the train and validation accuracy
# Plot train and validation accuracy
plt.plot(history3.history['accuracy'])
plt.plot(history3.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# Evaluate the 3rd model's performance on the test data.
accuracy_model_3 = cnn_model_3.evaluate(test_images,test_class_encoded)
print(accuracy_model_3)
print('Accuracy of model # 3 on Test Data: ','{:.2f}%'.format(accuracy_model_3[1]*100))
82/82 [==============================] - 1s 5ms/step - loss: 0.0963 - accuracy: 0.9835 [0.09627310186624527, 0.9834615588188171] Accuracy of model # 3 on Test Data: 98.35%
CNN Model # 3 is a complex model with 93K trainable parameters; it uses dropout and batch normalization in both the convolutional and dense layers, with a larger number of neurons than Model # 2. Yet there is no significant improvement in accuracy (98.50% for Model # 2 on the test data versus 98.35% for Model # 3 on the same data). This is expected given the data issues we saw earlier, and we will analyse the misclassifications for this model.
Model # 3 seems to be generalizing better on the test data, but Model # 2 has a better recall (99%) for the parasitized class, which means it is not missing infected cells: it misclassified only 15 out of 1,300 infected cells, which is very good.
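The recall figure quoted above can be verified by hand: recall for the parasitized class is TP / (TP + FN). A quick sketch using the counts reported in this notebook (not a re-run of the model):

```python
# Recall = true positives / (true positives + false negatives)
fn = 15         # parasitized cells predicted as uninfected (from the notebook)
support = 1300  # parasitized cells in the test set
tp = support - fn
recall = tp / support
print(f'Parasitized recall: {recall:.4f}')  # → Parasitized recall: 0.9885
```

Rounded to two decimals this is the 99% recall cited for Model # 2.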
Generate the classification report and confusion matrix
pred_class = cnn_model_3.predict(test_images)
pred_class = np.argmax(pred_class,axis=1)
true_class = np.argmax(test_class_encoded,axis=1)
print(classification_report(true_class, pred_class))
# Plotting the heatmap using confusion matrix
cm = confusion_matrix(true_class, pred_class)
plt.figure(figsize = (8, 5))
sns.heatmap(cm, annot = True, fmt = '.0f', xticklabels = ['Uninfected', 'Parasitized'], yticklabels = ['Uninfected', 'Parasitized'])
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()
82/82 [==============================] - 0s 3ms/step
              precision    recall  f1-score   support

           0       0.98      0.98      0.98      1300
           1       0.98      0.98      0.98      1300

    accuracy                           0.98      2600
   macro avg       0.98      0.98      0.98      2600
weighted avg       0.98      0.98      0.98      2600
Let's analyse the incorrect predictions for parasitized cell images
Parasitized_error = []
Uninfected_error = []
for i in range(len(true_class)):
    if true_class[i] == 1 and true_class[i] != pred_class[i]:
        Parasitized_error.append(test_images[i])
    if true_class[i] == 0 and true_class[i] != pred_class[i]:
        Uninfected_error.append(test_images[i])
fig = plt.figure(figsize=(10, 10))
# Show at most 20 misclassified parasitized images (4 x 5 grid)
for i in range(min(len(Parasitized_error), 20)):
    ax = plt.subplot(4, 5, i + 1)
    ax.axis('Off')
    plt.title('Parasitized_error')
    plt.imshow(Parasitized_error[i])
print('Parasitized predicted as Uninfected count:',len(Parasitized_error))
Parasitized predicted as Uninfected count: 23
Let's analyse the incorrect predictions for uninfected cell images
# Analyse Uninfected predicted as parasitized
fig = plt.figure(figsize=(10, 10))
print('Uninfected predicted as parasitized count:', len(Uninfected_error))
# Show at most 20 misclassified uninfected images (4 x 5 grid)
for i in range(min(len(Uninfected_error), 20)):
    ax = plt.subplot(4, 5, i + 1)
    ax.axis('Off')
    plt.title('Uninf_error')
    plt.imshow(Uninfected_error[i])
Uninfected predicted as parasitized count: 20
It can be seen again that, in the case of misclassified uninfected cells, the images do show parasite-like artifacts similar to those in actually infected cells. This could confuse the model and impede the learning process.
backend.clear_session()
# Fixing the seed for random number generators so that we can ensure we receive the same output every time
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(train_images, train_class_encoded, test_size = 0.2, random_state = 42)
train_datagen = ImageDataGenerator(
    height_shift_range=0.5,
    horizontal_flip=True,
    rotation_range=30,
    zoom_range=[0.5, 1.0]
)
val_datagen = ImageDataGenerator()
train_generator = train_datagen.flow(x_train, y_train, batch_size = 32, seed = 42, shuffle = True)
val_generator = val_datagen.flow(x_test, y_test, batch_size = 32, seed = 42, shuffle = True)
# Visualizing Augmented images using ImageDataGenerator
fig = plt.figure(figsize=(12, 8))
images, labels = next(train_generator)
for i in range(32):
    ax = plt.subplot(4, 8, i + 1)
    plt.imshow(images[i])
    if np.argmax(labels[i]) == 1:
        plt.title('parasitized')
    if np.argmax(labels[i]) == 0:
        plt.title('uninfected')
    plt.axis('off')
plt.show()
cnn_model_4 = Sequential()
cnn_model_4.add(Conv2D(32, (2,2), input_shape=(64, 64, 3), activation='relu', padding = 'same'))
cnn_model_4.add(MaxPooling2D(2,2))
cnn_model_4.add(Dropout(0.2))
cnn_model_4.add(Conv2D(32, (2,2), activation='relu', padding = 'same'))
cnn_model_4.add(MaxPooling2D(2,2))
cnn_model_4.add(Conv2D(32, (2,2), activation='relu', padding = 'same'))
cnn_model_4.add(MaxPooling2D(2,2))
cnn_model_4.add(Dropout(0.2))
cnn_model_4.add(BatchNormalization())
cnn_model_4.add(Flatten())
cnn_model_4.add(Dense(128, activation='relu'))
cnn_model_4.add(Dropout(0.2))
cnn_model_4.add(Dense(64, activation='relu'))
cnn_model_4.add(Dropout(0.2))
cnn_model_4.add(Dense(64, activation='relu'))
cnn_model_4.add(Dense(32, activation='relu'))
cnn_model_4.add(Dense(2, activation='softmax'))
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
cnn_model_4.compile(loss = 'categorical_crossentropy', optimizer = opt, metrics = ['accuracy'])
cnn_model_4.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 64, 64, 32) 416
max_pooling2d (MaxPooling2D) (None, 32, 32, 32) 0
dropout (Dropout) (None, 32, 32, 32) 0
conv2d_1 (Conv2D) (None, 32, 32, 32) 4128
max_pooling2d_1 (MaxPooling2D) (None, 16, 16, 32) 0
conv2d_2 (Conv2D) (None, 16, 16, 32) 4128
max_pooling2d_2 (MaxPooling2D) (None, 8, 8, 32) 0
dropout_1 (Dropout) (None, 8, 8, 32) 0
batch_normalization (BatchNormalization) (None, 8, 8, 32) 128
flatten (Flatten) (None, 2048) 0
dense (Dense) (None, 128) 262272
dropout_2 (Dropout) (None, 128) 0
dense_1 (Dense) (None, 64) 8256
dropout_3 (Dropout) (None, 64) 0
dense_2 (Dense) (None, 64) 4160
dense_3 (Dense) (None, 32) 2080
dense_4 (Dense) (None, 2) 66
=================================================================
Total params: 285,634
Trainable params: 285,570
Non-trainable params: 64
_________________________________________________________________
Using Callbacks
my_callbacks = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=4)
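The EarlyStopping callback above monitors the training loss and halts once it has failed to improve for `patience` consecutive epochs. A minimal pure-Python sketch of that stopping rule (the loss values are invented for illustration; the real Keras callback additionally supports `min_delta` and `restore_best_weights`):

```python
def early_stop_epoch(losses, patience=4):
    """Return the 1-based epoch at which training would stop, or None if it runs to the end."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:      # an improvement resets the patience counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Loss improves for three epochs, then stalls for four -> stop at epoch 7.
print(early_stop_epoch([0.9, 0.5, 0.4, 0.41, 0.42, 0.40, 0.43]))  # → 7
```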
Fit and Train the model
# Batch size and shuffling are set by the generator (32, shuffle=True), so they are not passed to fit()
history4 = cnn_model_4.fit(train_generator, validation_data = val_generator, epochs = 40, verbose = 1, callbacks=my_callbacks)
[Per-epoch training log for epochs 1-23 of 40 condensed; 624/624 steps, ~24s/epoch] Training accuracy rose from 0.7039 to 0.8732 (loss 0.5594 -> 0.2996) while validation accuracy climbed quickly and plateaued around 0.98 (best val_loss 0.0684 at epoch 13). Final epoch 23/40: loss 0.2996 - accuracy 0.8732 - val_loss 0.0697 - val_accuracy 0.9794.
# Plot train and validation accuracy
plt.plot(history4.history['accuracy'])
plt.plot(history4.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# Evaluate the 4th model's performance on the test data.
accuracy_model_4 = cnn_model_4.evaluate(test_images,test_class_encoded)
print(accuracy_model_4)
print('Accuracy of model # 4 on Test Data: ','{:.2f}%'.format(accuracy_model_4[1]*100))
82/82 [==============================] - 0s 4ms/step - loss: 0.0752 - accuracy: 0.9788 [0.07518284767866135, 0.9788461327552795] Accuracy of model # 4 on Test Data: 97.88%
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
pred_class = cnn_model_4.predict(test_images)
pred_class = np.argmax(pred_class,axis=1)
true_class = np.argmax(test_class_encoded,axis=1)
print(classification_report(true_class, pred_class))
# Plotting the heatmap using confusion matrix
cm = confusion_matrix(true_class, pred_class)
plt.figure(figsize = (8, 5))
sns.heatmap(cm, annot = True, fmt = '.0f', xticklabels = ['Uninfected', 'Parasitized'], yticklabels = ['Uninfected', 'Parasitized'])
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()
82/82 [==============================] - 0s 2ms/step
              precision    recall  f1-score   support

           0       0.98      0.98      0.98      1300
           1       0.98      0.98      0.98      1300

    accuracy                           0.98      2600
   macro avg       0.98      0.98      0.98      2600
weighted avg       0.98      0.98      0.98      2600
Now, let us try to use a pretrained model like VGG16 and check how it performs on our data.
backend.clear_session()
# Fixing the seed for random number generators so that we can ensure we receive the same output every time
np.random.seed(42)
import random
random.seed(42)
tf.random.set_seed(42)
from tensorflow.keras.applications.vgg16 import VGG16
## Loading VGG16 model
vgg16_model = VGG16(weights="imagenet", include_top=False, input_shape=(64,64,3))
vgg16_model.trainable = False ## Not trainable weights
from tensorflow.keras import layers, models
norm_layer = layers.BatchNormalization()
flatten_layer = layers.Flatten()
dense_layer_1 = layers.Dense(128, activation='relu')
drop_out_layer_1 = layers.Dropout(0.2)
dense_layer_2 = layers.Dense(64, activation='relu')
drop_out_layer_2 = layers.Dropout(0.2)
dense_layer_3 = layers.Dense(32, activation='relu')
drop_out_layer_3 = layers.Dropout(0.1)
prediction_layer = layers.Dense(2, activation='softmax')
cnn_model_5 = models.Sequential([
vgg16_model,
norm_layer,
flatten_layer,
dense_layer_1,
drop_out_layer_1,
dense_layer_2,
drop_out_layer_2,
dense_layer_3,
drop_out_layer_3,
prediction_layer
])
cnn_model_5.summary()
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58889256/58889256 [==============================] - 0s 0us/step
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
vgg16 (Functional) (None, 2, 2, 512) 14714688
batch_normalization (BatchNormalization) (None, 2, 2, 512) 2048
flatten (Flatten) (None, 2048) 0
dense (Dense) (None, 128) 262272
dropout (Dropout) (None, 128) 0
dense_1 (Dense) (None, 64) 8256
dropout_1 (Dropout) (None, 64) 0
dense_2 (Dense) (None, 32) 2080
dropout_2 (Dropout) (None, 32) 0
dense_3 (Dense) (None, 2) 66
=================================================================
Total params: 14,989,410
Trainable params: 273,698
Non-trainable params: 14,715,712
_________________________________________________________________
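The trainable count in the summary can be checked by hand: each Dense layer contributes `inputs × units + units` parameters, and the BatchNormalization layer over 512 channels has 1,024 trainable parameters (gamma and beta) plus 1,024 non-trainable moving statistics. A quick arithmetic sketch:

```python
def dense_params(n_in, n_out):
    return n_in * n_out + n_out  # weight matrix + bias vector

head = (
    dense_params(2 * 2 * 512, 128)  # flatten(2048) -> dense(128): 262,272
    + dense_params(128, 64)         # 8,256
    + dense_params(64, 32)          # 2,080
    + dense_params(32, 2)           # 66
)
bn_trainable = 2 * 512              # gamma and beta: 1,024 trainable
print(head + bn_trainable)          # → 273698, matching the summary
```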
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
cnn_model_5.compile(loss='categorical_crossentropy',optimizer=opt, metrics=['accuracy'])
my_callbacks = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)
# Pull one 32-image batch from the validation generator to use as validation data after each epoch
x_test, y_test = next(val_generator)
history5 = cnn_model_5.fit(train_generator,validation_data=(x_test, y_test),shuffle=True, epochs = 100, verbose = 1, callbacks=my_callbacks)
[Per-epoch training log for epochs 1-28 of 100 condensed; 624/624 steps, ~24s/epoch] Training accuracy rose slowly from 0.7532 to 0.8230 (loss 0.4931 -> 0.3775); validation accuracy on the single 32-image batch stepped between 0.8438 and 0.9688. Final epoch 28/100: loss 0.3775 - accuracy 0.8230 - val_loss 0.1291 - val_accuracy 0.9375.
# Plot train and validation accuracy
plt.plot(history5.history['accuracy'])
plt.plot(history5.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
What can be observed from the train and validation curves? Training accuracy holds steady after about 15 epochs, while validation accuracy swings by 2.5% to 5% between epochs, largely because validation here uses only a single 32-image batch. Increasing the callback "patience" might let the model generalise slightly better, but the accuracy does not improve much.
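With a 32-image validation batch, the reported val_accuracy can only take values k/32, which explains the step-like curve. The plateaus in the log (0.9062, 0.9375, 0.9688) correspond to 29, 30, and 31 correct images:

```python
batch = 32
for v in (0.9062, 0.9375, 0.9688):
    k = round(v * batch)              # number of correctly classified images
    assert abs(v - k / batch) < 5e-3  # each logged value sits on the k/32 grid
    print(f'{v} -> {k}/32 correct')
```

So one image flipping between classes moves the reported accuracy by 1/32 ≈ 3.1%.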
# Evaluate the 5th model's performance on the test data.
#test_images_from_gen,test_labels_from_gen = next(test_generator)
accuracy_model_5 = cnn_model_5.evaluate(test_images,test_class_encoded)
print(accuracy_model_5)
print('Accuracy of model # 5 on Test Data: ','{:.2f}%'.format(accuracy_model_5[1]*100))
82/82 [==============================] - 2s 19ms/step - loss: 0.1826 - accuracy: 0.9277 [0.18263845145702362, 0.9276922941207886] Accuracy of model # 5 on Test Data: 92.77%
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
pred_class = cnn_model_5.predict(test_images)
pred_class = np.argmax(pred_class,axis=1)
true_class = np.argmax(test_class_encoded,axis=1)
print(classification_report(true_class, pred_class))
# Plotting the heatmap using confusion matrix
cm = confusion_matrix(true_class, pred_class)
plt.figure(figsize = (8, 5))
sns.heatmap(cm, annot = True, fmt = '.0f', xticklabels = ['Uninfected', 'Parasitized'], yticklabels = ['Uninfected', 'Parasitized'])
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()
82/82 [==============================] - 1s 14ms/step
              precision    recall  f1-score   support

           0       0.92      0.94      0.93      1300
           1       0.94      0.92      0.93      1300

    accuracy                           0.93      2600
   macro avg       0.93      0.93      0.93      2600
weighted avg       0.93      0.93      0.93      2600
backend.clear_session()
# Fixing the seed for random number generators so that we can ensure we receive the same output every time
np.random.seed(42)
import random
random.seed(42)
tf.random.set_seed(42)
# Function to apply Gaussian blur to each image
gbr_train_images = []
gbr_train_class = []
def gaussian_blur(images, classes):
    gbr_array = []
    class_array = []
    for i in range(len(images)):
        gbr = cv2.GaussianBlur(images[i], (5, 5), 0)
        gbr_array.append(gbr)
        class_array.append(classes[i])
    return gbr_array, class_array
gbr_train_images,gbr_train_class = gaussian_blur(train_images,train_class_encoded)
gbr_train_images = np.array(gbr_train_images)
gbr_train_class = np.array(gbr_train_class)
print(gbr_train_images.shape)
print(gbr_train_class.shape)
(24958, 64, 64, 3) (24958, 2)
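`cv2.GaussianBlur(img, (5, 5), 0)` convolves each channel with a separable 5×5 Gaussian kernel; with sigma passed as 0, OpenCV derives it from the kernel size (per its documentation, `0.3*((ksize-1)*0.5 - 1) + 0.8`, i.e. 1.1 for a 5-tap kernel). A numpy-only sketch of such a kernel:

```python
import numpy as np

ksize = 5
sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8   # OpenCV's rule when sigma=0 -> 1.1
x = np.arange(ksize) - (ksize - 1) / 2        # tap offsets [-2, -1, 0, 1, 2]
g1d = np.exp(-x**2 / (2 * sigma**2))
g1d /= g1d.sum()                              # normalise: the blur preserves mean brightness
kernel2d = np.outer(g1d, g1d)                 # separable: 2-D kernel = outer product of 1-D
print(kernel2d.shape)                         # (5, 5)
```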
my_callbacks = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=8)
# This callback stops training when the loss has not improved
# for eight consecutive epochs.
history_gblur = cnn_model_2.fit(gbr_train_images,gbr_train_class, batch_size = 64, shuffle=True, validation_split = 0.2, epochs = 100, verbose = 1, callbacks=my_callbacks)
[Per-epoch training log for epochs 1-23 condensed; 312/312 steps, ~4s/epoch] Training accuracy rose from 0.9914 to 0.9966 (loss 0.0244 -> 0.0087) while validation accuracy fluctuated between 0.9273 and 0.9681 and validation loss drifted upward. Final epoch 23/100: loss 0.0087 - accuracy 0.9966 - val_loss 0.3970 - val_accuracy 0.9281.
# Plot train and validation accuracy
plt.plot(history_gblur.history['accuracy'])
plt.plot(history_gblur.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# Evaluate the 2nd model's performance on the test data when trained on blurred images.
accuracy_model_gblur = cnn_model_2.evaluate(test_images,test_class_encoded)
print(accuracy_model_gblur)
print('Accuracy of model # 2 (blurred images) on Test Data: ','{:.2f}%'.format(accuracy_model_gblur[1]*100))
82/82 [==============================] - 0s 4ms/step - loss: 0.1941 - accuracy: 0.9535 [0.19414742290973663, 0.9534615278244019] Accuracy of model # 2 (blurred images) on Test Data: 95.35%
# Print the classification report and confusion matrix based on 2nd model's performance on the test data when trained on blurred images.
pred_class = cnn_model_2.predict(test_images)
pred_class = np.argmax(pred_class,axis=1)
true_class = np.argmax(test_class_encoded,axis=1)
print(classification_report(true_class, pred_class))
# Plotting the heatmap using confusion matrix
cm = confusion_matrix(true_class, pred_class)
plt.figure(figsize = (8, 5))
sns.heatmap(cm, annot = True, fmt = '.0f', xticklabels = ['Uninfected', 'Parasitized'], yticklabels = ['Uninfected', 'Parasitized'])
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()
82/82 [==============================] - 0s 3ms/step
              precision    recall  f1-score   support

           0       0.99      0.91      0.95      1300
           1       0.92      0.99      0.96      1300

    accuracy                           0.95      2600
   macro avg       0.96      0.95      0.95      2600
weighted avg       0.96      0.95      0.95      2600
It is interesting to find that Gaussian blurring improved recall for the parasitized class to 99%: the model misclassified only 8 out of 1,300 infected cells. Let us visualise the misclassified images.
Parasitized_error = []
Uninfected_error = []
for i in range(len(true_class)):
    if true_class[i] == 1 and true_class[i] != pred_class[i]:
        Parasitized_error.append(test_images[i])
    if true_class[i] == 0 and true_class[i] != pred_class[i]:
        Uninfected_error.append(test_images[i])
fig = plt.figure(figsize=(10, 6))
# Show at most 8 misclassified parasitized images (2 x 4 grid)
for i in range(min(len(Parasitized_error), 8)):
    ax = plt.subplot(2, 4, i + 1)
    ax.axis('Off')
    plt.title('Parasitized_error')
    plt.imshow(Parasitized_error[i])
print('Parasitized predicted as Uninfected count:',len(Parasitized_error))
Parasitized predicted as Uninfected count: 8
# Analyse a random sample of uninfected images that were predicted as infected
fig = plt.figure(figsize=(10, 10))
print('Uninfected predicted as parasitized count:', len(Uninfected_error))
uninf_sample = random.choices(Uninfected_error, k=20)
for i in range(len(uninf_sample)):
    ax = plt.subplot(4, 5, i + 1)
    ax.axis('Off')
    plt.title('Uninf_error')
    plt.imshow(uninf_sample[i])
Uninfected predicted as parasitized count: 113
backend.clear_session()
# Fixing the seed for random number generators so that we can ensure we receive the same output every time
np.random.seed(42)
import random
random.seed(42)
tf.random.set_seed(42)
# Function rgb2hsv to convert images from RGB to HSV
def rgb2hsv(images, classes):
    hsv_array = []
    class_array = []
    for i in range(len(images)):
        # Note: cv2 conversion flags encode the expected channel order. The
        # images here are in RGB order, so COLOR_BGR2HSV swaps the red and
        # blue channels before converting; COLOR_RGB2HSV would match the data.
        hsv = cv2.cvtColor(images[i], cv2.COLOR_BGR2HSV)
        hsv_array.append(hsv)
        class_array.append(classes[i])
    return hsv_array, class_array
hsv_train_images,hsv_train_class = rgb2hsv(train_images,train_class_encoded)
hsv_train_images = np.array(hsv_train_images)
hsv_train_class = np.array(hsv_train_class)
print(hsv_train_images.shape)
print(hsv_train_class.shape)
(24958, 64, 64, 3) (24958, 2)
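Before training on the converted images, it is worth noting a channel-order pitfall: cv2's conversion flags assume a specific channel order, so applying `COLOR_BGR2HSV` to RGB arrays swaps red and blue before converting, which changes the hue. A stdlib-only illustration with `colorsys` (the pixel value is invented):

```python
import colorsys

r, g, b = 0.8, 0.2, 0.3                          # a reddish pixel, channels in [0, 1]
h_rgb, _, _ = colorsys.rgb_to_hsv(r, g, b)       # hue with the correct channel order
h_swapped, _, _ = colorsys.rgb_to_hsv(b, g, r)   # what a BGR-assuming conversion would see
print(round(h_rgb, 3), round(h_swapped, 3))      # two different hues for the same pixel
```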
my_callbacks = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=20)
# This callback stops training when the loss has not improved
# for 20 consecutive epochs.
history_hsv = cnn_model_2.fit(hsv_train_images,hsv_train_class, batch_size = 64, shuffle=True, validation_split = 0.2, epochs = 100, verbose = 1, callbacks=my_callbacks)
[Per-epoch training log for epochs 1-41 condensed; 312/312 steps, ~4s/epoch; log truncated mid-epoch 41] Training accuracy stayed stuck around 0.63-0.64 (loss ~0.66-0.65) and validation accuracy remained essentially zero (val_loss ~1.0) throughout, indicating that the model failed to learn from the HSV-converted inputs.
0.9923 - val_accuracy: 0.0086 Epoch 42/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6484 - accuracy: 0.6407 - val_loss: 0.9902 - val_accuracy: 0.0074 Epoch 43/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6493 - accuracy: 0.6409 - val_loss: 1.0000 - val_accuracy: 0.0028 Epoch 44/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6460 - accuracy: 0.6423 - val_loss: 1.0118 - val_accuracy: 0.0112 Epoch 45/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6458 - accuracy: 0.6434 - val_loss: 0.9935 - val_accuracy: 0.0140 Epoch 46/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6464 - accuracy: 0.6426 - val_loss: 1.0148 - val_accuracy: 0.0108 Epoch 47/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6485 - accuracy: 0.6425 - val_loss: 1.0148 - val_accuracy: 0.0050 Epoch 48/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6475 - accuracy: 0.6421 - val_loss: 1.0253 - val_accuracy: 0.0100 Epoch 49/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6453 - accuracy: 0.6433 - val_loss: 1.0156 - val_accuracy: 0.0118 Epoch 50/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6407 - accuracy: 0.6461 - val_loss: 1.0352 - val_accuracy: 0.0110 Epoch 51/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6422 - accuracy: 0.6448 - val_loss: 1.0138 - val_accuracy: 0.0092 Epoch 52/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6417 - accuracy: 0.6460 - val_loss: 0.9968 - val_accuracy: 0.0070 Epoch 53/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6408 - accuracy: 0.6461 - val_loss: 1.0026 - val_accuracy: 0.0094 Epoch 54/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6401 - accuracy: 0.6463 - val_loss: 1.0106 - val_accuracy: 0.0150 Epoch 55/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6394 
- accuracy: 0.6470 - val_loss: 1.0149 - val_accuracy: 0.0142 Epoch 56/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6390 - accuracy: 0.6480 - val_loss: 0.9988 - val_accuracy: 0.0110 Epoch 57/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6398 - accuracy: 0.6465 - val_loss: 1.0155 - val_accuracy: 0.0076 Epoch 58/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6406 - accuracy: 0.6471 - val_loss: 1.0130 - val_accuracy: 0.0122 Epoch 59/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6382 - accuracy: 0.6480 - val_loss: 1.0148 - val_accuracy: 0.0234 Epoch 60/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6374 - accuracy: 0.6476 - val_loss: 1.0298 - val_accuracy: 0.0096 Epoch 61/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6409 - accuracy: 0.6477 - val_loss: 0.9478 - val_accuracy: 0.0186 Epoch 62/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6411 - accuracy: 0.6478 - val_loss: 0.9892 - val_accuracy: 0.0120 Epoch 63/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6419 - accuracy: 0.6469 - val_loss: 0.9957 - val_accuracy: 0.0118 Epoch 64/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6398 - accuracy: 0.6475 - val_loss: 1.0031 - val_accuracy: 0.0102 Epoch 65/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6370 - accuracy: 0.6491 - val_loss: 1.0016 - val_accuracy: 0.0172 Epoch 66/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6348 - accuracy: 0.6503 - val_loss: 1.0142 - val_accuracy: 0.0154 Epoch 67/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6366 - accuracy: 0.6496 - val_loss: 1.0159 - val_accuracy: 0.0146 Epoch 68/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6367 - accuracy: 0.6494 - val_loss: 1.0159 - val_accuracy: 0.0102 Epoch 69/100 312/312 [==============================] 
- 4s 14ms/step - loss: 0.6361 - accuracy: 0.6496 - val_loss: 1.0084 - val_accuracy: 0.0126 Epoch 70/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6343 - accuracy: 0.6504 - val_loss: 1.0181 - val_accuracy: 0.0094 Epoch 71/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6336 - accuracy: 0.6505 - val_loss: 1.0082 - val_accuracy: 0.0158 Epoch 72/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6356 - accuracy: 0.6501 - val_loss: 1.0125 - val_accuracy: 0.0124 Epoch 73/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6344 - accuracy: 0.6502 - val_loss: 0.9989 - val_accuracy: 0.0186 Epoch 74/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6332 - accuracy: 0.6505 - val_loss: 0.9932 - val_accuracy: 0.0204 Epoch 75/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6345 - accuracy: 0.6498 - val_loss: 1.0049 - val_accuracy: 0.0178 Epoch 76/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6315 - accuracy: 0.6515 - val_loss: 1.0251 - val_accuracy: 0.0132 Epoch 77/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6338 - accuracy: 0.6505 - val_loss: 1.0162 - val_accuracy: 0.0148 Epoch 78/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6306 - accuracy: 0.6517 - val_loss: 1.0219 - val_accuracy: 0.0154 Epoch 79/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6314 - accuracy: 0.6513 - val_loss: 1.0238 - val_accuracy: 0.0120 Epoch 80/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6305 - accuracy: 0.6510 - val_loss: 1.0113 - val_accuracy: 0.0168 Epoch 81/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6306 - accuracy: 0.6522 - val_loss: 1.0088 - val_accuracy: 0.0158 Epoch 82/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6300 - accuracy: 0.6518 - val_loss: 1.0477 - val_accuracy: 0.0158 Epoch 83/100 312/312 
[==============================] - 4s 14ms/step - loss: 0.6311 - accuracy: 0.6513 - val_loss: 1.0115 - val_accuracy: 0.0196 Epoch 84/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6299 - accuracy: 0.6519 - val_loss: 1.0185 - val_accuracy: 0.0166 Epoch 85/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6286 - accuracy: 0.6525 - val_loss: 1.0363 - val_accuracy: 0.0172 Epoch 86/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6295 - accuracy: 0.6517 - val_loss: 1.0311 - val_accuracy: 0.0214 Epoch 87/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6287 - accuracy: 0.6518 - val_loss: 1.0692 - val_accuracy: 0.0186 Epoch 88/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6285 - accuracy: 0.6523 - val_loss: 1.0210 - val_accuracy: 0.0192 Epoch 89/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6289 - accuracy: 0.6524 - val_loss: 1.0209 - val_accuracy: 0.0206 Epoch 90/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6298 - accuracy: 0.6521 - val_loss: 1.0363 - val_accuracy: 0.0170 Epoch 91/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6281 - accuracy: 0.6526 - val_loss: 1.0005 - val_accuracy: 0.0264 Epoch 92/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6274 - accuracy: 0.6526 - val_loss: 0.9848 - val_accuracy: 0.0248 Epoch 93/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6279 - accuracy: 0.6524 - val_loss: 1.0678 - val_accuracy: 0.0182 Epoch 94/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6271 - accuracy: 0.6526 - val_loss: 1.0180 - val_accuracy: 0.0214 Epoch 95/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6263 - accuracy: 0.6529 - val_loss: 1.0306 - val_accuracy: 0.0174 Epoch 96/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6271 - accuracy: 0.6530 - val_loss: 1.0390 - 
val_accuracy: 0.0170 Epoch 97/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6267 - accuracy: 0.6529 - val_loss: 1.0595 - val_accuracy: 0.0120 Epoch 98/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6261 - accuracy: 0.6534 - val_loss: 1.0704 - val_accuracy: 0.0180 Epoch 99/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6254 - accuracy: 0.6538 - val_loss: 1.0700 - val_accuracy: 0.0258 Epoch 100/100 312/312 [==============================] - 4s 14ms/step - loss: 0.6252 - accuracy: 0.6540 - val_loss: 1.0418 - val_accuracy: 0.0158
# Evaluate the 2nd model's performance on the test data when trained on HSV images.
scores_model_hsv = cnn_model_2.evaluate(test_images, test_class_encoded)  # returns [loss, accuracy]
print(scores_model_hsv)
print('Accuracy of model # 2 on Test Data: {:.2f}%'.format(scores_model_hsv[1] * 100))
82/82 [==============================] - 0s 4ms/step - loss: 0.9655 - accuracy: 0.5000 [0.965542733669281, 0.5] Accuracy of model # 2 on Test Data: 50.00%
What observations and insights can be drawn from the confusion matrix and classification report? Choose the model with the best accuracy scores from all the above models and save it as a final model.
The final model of choice is CNN model No. 2. The model gave an accuracy of 98.38% on test data and produced a recall of 99% for parasitized and 98% for uninfected images, indicating that the model catches nearly all infections while rarely mislabelling uninfected cells.
CNN model No. 4, which uses data augmentation, generalises well across training and test data but has a recall of 97%, making it slightly less accurate than model No. 2. It is also a viable choice because the variance between its training and test performance is the lowest of all the models, but its recall would need to be improved.
Model No. 5 used VGG16 for transfer learning, but it did not match the accuracy of the simpler CNN models and takes much longer to train.
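For reference, a transfer-learning setup of this kind typically freezes the VGG16 convolutional base and trains only a small classification head. The sketch below is illustrative, not the exact configuration used here: the input size (64x64), head sizes (128 units, 0.3 dropout), and optimizer are assumptions.

```python
# Minimal sketch of VGG16 transfer learning for binary cell classification.
# Layer sizes and input shape are illustrative assumptions.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Dropout, Flatten

# Load the VGG16 convolutional base pre-trained on ImageNet, without its classifier
base = VGG16(weights='imagenet', include_top=False, input_shape=(64, 64, 3))
base.trainable = False  # freeze the pre-trained weights

# Attach a small classification head for the 2 classes
x = Flatten()(base.output)
x = Dense(128, activation='relu')(x)
x = Dropout(0.3)(x)
out = Dense(2, activation='softmax')(x)

model = Model(inputs=base.input, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
```

The large frozen base explains the longer training times: every batch must still be pushed through all 13 VGG16 convolutional layers even though only the head is updated.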
Can the model performance be improved using other pre-trained models or different CNN architecture? You can try to build a model using these HSV images and compare them with your other models.
What are the most meaningful insights from the data relevant to the problem?
How do different techniques perform? Which one is performing relatively better? Is there scope to improve the performance further?
What model do you propose to be adopted? Why is this the best solution to adopt?
The dataset provided had enough samples for both classes, even though the uninfected set is slightly smaller than the infected set. I evaluated the precision and recall for both classes to check whether this imbalance caused significant variation in the scores, but saw a difference of only 1%. This is also validated by the f1-score of 98%.
The dataset was easy to train on quickly using simple CNN models without much complexity: the loss dropped rapidly after each epoch, and the models reached very good accuracy early in training. The parasite is visible in almost all of the infected cells.
I analysed the best model's wrong predictions on the test dataset for both the parasitized and uninfected classes and found that some test images truly labelled as uninfected showed traces of parasites and did not look clean enough to be classified as uninfected. Such labelling differences could also contribute to the misclassification of infected images.
For this problem, we want to minimise false negatives for infections, as it is dangerous to misclassify an infection as uninfected and leave the patient vulnerable. I therefore recommend a model that gives the best recall for infections, very good overall accuracy, and good generalisation to the test dataset. The chosen model has an f1-score of 98%, which is very good, with little variation between precision and recall.
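To make the recall argument concrete, the snippet below shows how recall for the infected class is computed with scikit-learn. The labels are made-up stand-ins, not the notebook's actual predictions.

```python
# Illustrative sketch: recall and the confusion matrix for the infected class.
# 1 = parasitized (infected), 0 = uninfected. Labels are made-up examples.
import numpy as np
from sklearn.metrics import confusion_matrix, recall_score

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 1, 0])  # 1 false negative, 1 false positive

# Recall for the infected class: of all truly infected cells,
# what fraction did the model catch? Recall = TP / (TP + FN)
recall_infected = recall_score(y_true, y_pred, pos_label=1)

print(confusion_matrix(y_true, y_pred))  # rows = true class, cols = predicted
print('Recall (parasitized):', recall_infected)  # 3 of 4 infected caught -> 0.75
```

A false negative here is an infected cell the model calls uninfected, which is exactly the error we most want to avoid.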
I trained several CNN models and found one (CNN model No. 2) that gave the best recall for infections along with a good accuracy/f1-score. I also applied Gaussian blurring and converted the training images to HSV to evaluate the impact of image pre-processing on the best-performing model.
I found that the recall score for infections consistently improves with Gaussian blurring, but overall accuracy suffers because the model misclassifies many uninfected cells as infected.
Converting the images to HSV performed very poorly on the evaluation and test datasets, as it increased the "noise" in the images and the model was not able to train well.
It will be necessary to retrain the models as more data becomes available, so the model can learn from new data and further improve its accuracy and recall scores. We looked into the misclassified test images to better understand the infected and uninfected classes and to investigate whether any errors were made in the data sampling process. We found that some images truly labelled as uninfected had "patches" that may indicate the presence of a parasite, and vice versa.
It is important to discuss these findings with our stakeholders to get more clarity about the images used for training: as we know, "garbage in, garbage out". If we are sure about the quality of the data, we can use better image pre-processing techniques and data augmentation to get accurate predictions from a much simpler model.